import pandas as pd
import numpy as np
from lets_plot import *
LetsPlot.setup_html(isolated_frame=True)
For Project 1 the answer to each question should include a chart and a written response. The year labels on your charts should not include a comma. At least two of your charts must include reference marks.
import os
os.getcwd()
from p1_source import my_name_plot_data
# my_name_plot_data
Source code available at p1_source.py
How does your name at your birth year compare to its use historically?
My name, “Dallin,” occurred most often about 3 years after I was born, but it had been trending up from obscurity for almost 8 years before my birth in 1996.
from p1_source import my_name_plot
my_name_plot
If you talked to someone named Brittany on the phone, what is your guess of his or her age? What ages would you not guess?
The dataset indicates that Brittanys aged 35 (in 2025) are most common. There are very few people by that name older than 50, and only a few younger than 19.
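The age reasoning above comes down to subtracting a birth year from the reference year (2025) and finding where the counts peak. A minimal sketch of that step follows. The counts are toy values invented for illustration, not figures from names_year.csv, and the helper names are hypothetical rather than anything defined in p1_source.py.

```python
REFERENCE_YEAR = 2025  # the "current" year used in the written answer


def ages_from_birth_years(counts_by_year, reference_year=REFERENCE_YEAR):
    """Map {birth_year: count} to {age_in_reference_year: count}."""
    return {reference_year - year: count for year, count in counts_by_year.items()}


def most_likely_age(counts_by_year, reference_year=REFERENCE_YEAR):
    """Return the age corresponding to the birth year with the highest count."""
    peak_year = max(counts_by_year, key=counts_by_year.get)
    return reference_year - peak_year


# Toy counts, invented for illustration only (NOT the real Brittany data).
toy_counts = {1970: 5, 1980: 1500, 1990: 32000, 2000: 2500, 2010: 300}

print(most_likely_age(toy_counts))
print(ages_from_birth_years(toy_counts))
```

With real data, the same dictionary could be built from the year and Total columns for a single name.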
from p1_source import brittany_plot
brittany_plot
Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice?
All four names dip starting around 1925, then boom during the war years and peak after the end of the war.
from p1_source import q3_data_plot
q3_data_plot
This graph attempts to normalize each name's counts relative to its frequency in 1920. The four names follow a very similar pattern, with Peter the most strongly affected.
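The normalization described above can be sketched as dividing every yearly count for a name by that name's count in the base year, so each series starts at 1.0 in 1920. This is a plain-Python illustration under assumptions: the actual implementation in p1_source.py is not shown here and may differ, the function name is hypothetical, and the rows are toy values invented for the example.

```python
def normalize_to_base_year(rows, base_year=1920):
    """Scale each name's totals by that name's total in base_year.

    rows: list of (name, year, total) tuples.
    Returns a list of (name, year, total / base_total) tuples.
    Names with no (or zero) base-year total are skipped.
    """
    base = {name: total for name, year, total in rows if year == base_year}
    normalized = []
    for name, year, total in rows:
        if not base.get(name):
            continue  # cannot normalize without a nonzero base-year value
        normalized.append((name, year, total / base[name]))
    return normalized


# Toy rows, invented for illustration only (NOT values from names_year.csv).
toy_rows = [
    ("Mary", 1920, 100.0),
    ("Mary", 1950, 150.0),
    ("Peter", 1920, 10.0),
    ("Peter", 1950, 40.0),
]

for row in normalize_to_base_year(toy_rows):
    print(row)
```

Plotting the ratio instead of the raw count puts names of very different popularity on one comparable scale, which is why a less common name can show the largest relative swing.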
from p1_source import q3_data_plot_alt
q3_data_plot_alt
Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?
Neo, from The Matrix (1999), has no occurrences before 1999. This correlation might indicate a causal relationship between the release of the movie and the appearance of the name. There seems to be no significant decrease in the name’s frequency in the decade and a half of data since the movie’s release. The Matrix sequels may have had an effect, but both sequels were released in 2003 (not counting the 2021 follow-up). The name seems relatively uncommon, but I did not compare it against aggregate frequencies for other names in the set.
# Include and execute your code here
from p1_source import movie_name_plot, movie_name_data
movie_name_plot
movie_name_data
| | name | year | Total |
|---|---|---|---|
| 287684 | Neo | 1999 | 7.0 |
| 287685 | Neo | 2000 | 50.0 |
| 287686 | Neo | 2001 | 64.0 |
| 287687 | Neo | 2002 | 25.0 |
| 287688 | Neo | 2003 | 51.0 |
| 287689 | Neo | 2004 | 67.0 |
| 287690 | Neo | 2005 | 46.0 |
| 287691 | Neo | 2006 | 70.0 |
| 287692 | Neo | 2007 | 43.0 |
| 287693 | Neo | 2008 | 30.0 |
| 287694 | Neo | 2009 | 46.0 |
| 287695 | Neo | 2010 | 30.0 |
| 287696 | Neo | 2011 | 28.0 |
| 287697 | Neo | 2012 | 29.0 |
| 287698 | Neo | 2013 | 26.0 |
| 287699 | Neo | 2014 | 52.0 |
| 287700 | Neo | 2015 | 38.0 |
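The table values above can be used for a rough check of the written answer: when the name first appears, when it peaks, and how the earlier and later years compare on average. This is only a sketch. The split into 2000-2007 and 2008-2015 is an arbitrary choice made for illustration, and `mean_total` is a hypothetical helper, not part of p1_source.py.

```python
RELEASE_YEAR = 1999  # release year of The Matrix, as stated in the answer

# Totals copied from the movie_name_data table above.
neo_totals = {
    1999: 7.0, 2000: 50.0, 2001: 64.0, 2002: 25.0, 2003: 51.0, 2004: 67.0,
    2005: 46.0, 2006: 70.0, 2007: 43.0, 2008: 30.0, 2009: 46.0, 2010: 30.0,
    2011: 28.0, 2012: 29.0, 2013: 26.0, 2014: 52.0, 2015: 38.0,
}


def mean_total(totals, start, end):
    """Average the totals for years in [start, end] that are present."""
    values = [totals[year] for year in range(start, end + 1) if year in totals]
    return sum(values) / len(values)


first_year = min(neo_totals)                      # earliest year with any record
peak_year = max(neo_totals, key=neo_totals.get)   # year with the highest total
early_mean = mean_total(neo_totals, 2000, 2007)   # first eight full years
late_mean = mean_total(neo_totals, 2008, 2015)    # last eight years

print(first_year >= RELEASE_YEAR, first_year, peak_year)
print(early_mean, late_mean)
```

On these values the earlier window averages 52.0 and the later one 34.875, so the name stays in use throughout but runs somewhat lower in the later years. Whether that counts as a meaningful decrease is a judgment call at counts this small.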
Reproduce the chart Elliot using the data from the names_year.csv file.
I created the chart and used some ggplot tooling to adjust its look and feel to match the example. I couldn’t figure out labels for the vertical lines: geom_vline() doesn’t seem to support labels, and geom_text() was causing the entire render to fail.
# Include and execute your code here
from p1_source import elliot_data, elliot_plot
elliot_data
| | name | year | Total |
|---|---|---|---|
| 118687 | Elliot | 1911.0 | 5.0 |
| 118688 | Elliot | 1912.0 | 6.0 |
| 118689 | Elliot | 1913.0 | 11.0 |
| 118690 | Elliot | 1914.0 | 24.0 |
| 118691 | Elliot | 1915.0 | 22.0 |
| ... | ... | ... | ... |
| 118787 | Elliot | 2011.0 | 891.5 |
| 118788 | Elliot | 2012.0 | 1042.5 |
| 118789 | Elliot | 2013.0 | 1064.5 |
| 118790 | Elliot | 2014.0 | 1199.0 |
| 118791 | Elliot | 2015.0 | 1250.0 |
105 rows × 3 columns
elliot_plot